SMILES Extensions for Pattern Matching and Molecular Transformations: Applications in Chemoinformatics

Authors

  • Richard G. A. Bone
  • Michael A. Firth
  • Richard A. Sykes
Abstract

The selection and modification of atoms or functional groups underlie many of the manipulations central to molecular modeling. It has become even more important to automate these tasks with the current prevalence of work with large databases of molecules. We have devised SUPER-SMILES, a conceptually simple set of extensions to the SMILES line notation, whose key features are addition and deletion facilities, macros, atom tagging, disjunctions, and constraints. This superset of SMILES enables us to carry out transformations on individual molecular structures or across members of a database with a pattern-matching protocol. The principal advantage of SUPER-SMILES is the ability to specify chemical reactions with a very simple augmentation of the SMILES line notation. For example, in conjunction with macros, it is possible to represent the displacement of tosylate with phenoxy by the expression “(Delete Tosyl) (Add Phenoxy)”. SUPER-SMILES thus represents a unified approach to molecular structure specification and modification and can easily be applied to large datasets of molecules. This functionality has been implemented within the PROMETHEUS suite of CAMD programs. We demonstrate its use in carrying out such operations as atom-type assignment, protonation of molecules, valency checking, and hydrogen addition. Further applications such as library design and construction immediately suggest themselves.
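
The macro-based expression quoted in the abstract can be illustrated with a small sketch. The abstract does not give the actual SUPER-SMILES grammar or the macro definitions used in PROMETHEUS, so everything below is an assumption made for illustration: the `MACROS` table, the SMILES fragments assigned to `Tosyl` and `Phenoxy`, the `STEP` pattern, and the `expand` function are all hypothetical. The sketch only parses an expression into its steps and expands each macro name; it does not perform pattern matching or modify a structure.

```python
import re

# Hypothetical macro table: names mapped to plain SMILES fragments.
# These definitions are illustrative assumptions, not taken from the paper.
MACROS = {
    "Tosyl": "S(=O)(=O)c1ccc(C)cc1",
    "Phenoxy": "Oc1ccccc1",
}

# Matches one parenthesised step such as "(Delete Tosyl)".
# The "(Operation Name)" shape follows the example quoted in the abstract;
# the exact syntax accepted here is an assumption.
STEP = re.compile(r"\(\s*(Add|Delete)\s+(\w+)\s*\)")


def expand(expression, macros=MACROS):
    """Return a list of (operation, macro name, SMILES fragment) tuples."""
    steps = []
    for op, name in STEP.findall(expression):
        if name not in macros:
            raise KeyError(f"undefined macro: {name}")
        steps.append((op, name, macros[name]))
    return steps


if __name__ == "__main__":
    for step in expand("(Delete Tosyl) (Add Phenoxy)"):
        print(step)
```

Keeping the macro table separate from the parser mirrors the idea that a short, readable expression can stand in for longer line-notation fragments; a real implementation would then match and edit the molecular graph, which this sketch leaves out.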

Similar resources

Entropy of infinite systems and transformations

The Kolmogorov-Sinai entropy is a far reaching dynamical generalization of Shannon entropy of information systems. This entropy works perfectly for probability measure preserving (p.m.p.) transformations. However, it is not useful when there is no finite invariant measure. There are certain successful extensions of the notion of entropy to infinite measure spaces, or transformations with ...

Evaluation of Similarity Measures for Template Matching

Image matching is a critical process in various photogrammetry, computer vision and remote sensing applications such as image registration, 3D model reconstruction, change detection, image fusion, pattern recognition, autonomous navigation, and digital elevation model (DEM) generation and orientation. The primary goal of the image matching process is to establish the correspondence between two ...

A Fully Reversible Data Transform Technique Enhancing Data Compression of SMILES Data

The requirement to efficiently store and process SMILES data used in Chemoinformatics creates a demand for efficient techniques to compress this data. General-purpose transforms and compressors are available to transform and compress this type of data to a certain extent, however, these techniques are not specific to SMILES data. We develop a transform specific to SMILES data that can be used a...

Jmol SMILES and Jmol SMARTS: specifications and applications

BACKGROUND SMILES and SMARTS are two well-defined structure matching languages that have gained wide use in cheminformatics. Jmol is a widely used open-source molecular visualization and analysis tool written in Java and implemented in both Java and JavaScript. Over the past 10 years, from 2007 to 2016, work on Jmol has included the development of dialects of SMILES and SMARTS that incorporate ...

Treelet kernel incorporating cyclic, stereo and inter pattern information in chemoinformatics

Chemoinformatics is a research field concerned with the study of physical or biological molecular properties through computer science’s research fields such as machine learning and graph theory. From this point of view, graph kernels provide a nice framework which allows to naturally combine machine learning and graph theory techniques. Graph kernels based on bags of patterns have proven their ...

Journal:
  • Journal of Chemical Information and Computer Sciences

Volume 39

Publication date: 1999